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Abstract. We present in this paper a Bayesian parameter estimation method 
for the analysis of interferometric gravitational wave observations of an inspiral of 
binary compact objects using data recorded simultaneously by a network of several 
interferometers at different sites. We consider neutron star or black hole inspirals that 
are modeled to 3.5 post-Newtonian (PN) order in phase and 2.5 PN in amplitude. 
Inference is facilitated using Markov chain Monte Carlo methods that are adapted in 
order to efficiently explore the particular parameter space. Examples are shown to 
illustrate how and what information about the different parameters can be derived 
from the data. This study uses simulated signals and data with noise characteristics 
that are assumed to be defined by the LIGO and Virgo detectors operating at their 
design sensitivities. Nine parameters are estimated, including those associated with 
the binary system, plus its location on the sky. We explain how this technique will 
be part of a detection pipeline for binary systems of compact objects with masses up 
to 20 Mq , including cases where the ratio of the individual masses can be extreme. 
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1. Introduction 

A world-wide network of interferometric gravitational wave detectors is now on-line. 
LIGO has reached its target sensitivity [HE], and Virgo is fast approaching theirs [3,11]. 
GEO [5] and TAMA [6] are also participating in the search for gravitational waves. 
Compact binary systems will certainly produce gravitational waves |7], and they are 
likely to be one of the most promising sources. 

The LIGO Scientific Collaboration (LSC) [8] and Virgo [9|, [10] each have search 
pipelines for binary inspiral events, and studies have shown that these pipelines have 
equivalent detection capabilities [11]. The LSC has conducted searches for binary 
neutron star inspirals [HI US], primordial black hole binary coalescences in the galactic 
halo [13j, and black hole binaries [1^. The LSC and TAMA have conducted a joint 
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search for binary neutron star systems |15], and soon the LSC and Virgo will be 
conducting collaborative searches p,lj. 

The purpose of a binary inspiral detection pipeline is to find a signal within the data. 
Once researchers suspect that a signal is present then parameter estimation techniques 
can be applied in order to produce estimates and summary statistics for the astrophysical 
parameters. Bayesian Markov chain Monte Carlo (MCMC) methods |T6] are well suited 
for this problem, especially since it is possible to produce accurate predictions for the 
form of the signal. MCMC parameter estimation techniques have been developed for 
binary neutron star inspirals, as seen by a single interferometer [17]. In addition, 
MCMC methods have been developed for the coherent analysis of data from a world- 
wide network of interferometers [18] . 

A difficult detection scenario involves finding a signal produced by a binary system 
where the mass ratio between the two objects is large. In such a case, the signal 
will likely have its amplitude significantly modulated (as opposed to just a 'simple' 
chirp with monotonously increasing frequency and amplitude), and it will be necessary 
to use higher-order post-Newtonian (PN) approximations. In this paper we present 
a description of our method for producing parameter estimates associated with a 
binary inspiral modeled to 3.5 post-Newtonain (PN) order in phase, and 2.5 PN in 
amplitude [T9l [21] . There are numerous goals that we wish to address with this version 
of our code. We employ new and more advanced MCMC methods, such as evolutionary 
MCMC [23]. The higher order PN templates will also allow for examination of signals 
where the amplitude is modulated, as may be the case with rather large ratios between 
the masses of the compact objects. Finally, we see this MCMC program as part of a 
larger detection pipeline for signals from binary inspirals with large mass ratios, and 
individual masses going up to 20 Mg. We imagine, for example, using an existing de- 
tection pipeline [10] to generate a reasonable number of triggers; the MCMC would 
then analyze each of the triggers in detail. Once the MCMC has reached convergence, 
an estimate for the signal parameters would be produced. In this paper we provide a 
description of the MCMC component of this detection pipeline. 

2. Inference framework 
2.1. The Bayesian approach 

We follow a Bayesian approach in order to do inference on the inspiral signal's 
parameters, since this better allows one to address the questions of immediate interest 
in such a context. Other methods (e.g. matched-filtering methods) on the other hand 
usually follow a Maximum-Likelihood approach, which does not yield as satisfactorily 
interpretable results, and does not exploit the information available in the data to the 
same extent [24]. In a Bayesian setup, information about parameters is fomulated in 
terms of probability distributions on the parameter space. First, the pre-observational 
knowledge is expressed in the prior distribution, and inference eventually is done 
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through the parameters' posterior distribution that is conditional on the observed 
data, and follows through the application of Bayes' theorem on the prior and the 
data model (likelihood). The parameters' posterior distribution then expresses the 
information about the parameters given the prior knowledge, the model, and the data 
at hand [25l [261 [27] . 

2.2. MCMC methods 

Once the Bayesian framework is set up, inference depends on evaluating the parameters' 
posterior distribution, which is given in terms of the (non-normalized) posterior density, 
in our case a function of 9 parameters. Typically, one will be interested in figures such 
as posterior means, confidence bounds, or marginal densities for individual parameters, 
which requires integration of the posterior over the parameter space. This problem is 
commonly approached using Monte Carlo integration, i.e. by simulating random draws 
from the posterior distribution, and then approximating the desired integrals by sample 
statistics (means by averages, etc.). The most popular algorithms for this purpose are 
Markov chain Monte Carlo (MCMC) samplers that simulate a random walk through 
parameter space whose stationary distribution is the posterior distribution [27|, [16] . 

Metropolis- (and related) MCMC algorithms also have nice optimization properties. 
In fact they happen to behave very similar to e.g. a Nelder-Mead algorithm which is 
extended to a simulated annealing algorithm; on its random walk through parameter 
space it will always accept an 'uphill' step, and sometimes (randomly) a 'downhill' step 
as well |28]. This property often comes in handy since the problem — as in our case — 
usually is not only to sample from the posterior, but also to first Gnd the global posterior 
mode(s) within a complex posterior surface, and among numerous minor modes. These 
convergence properties can also be enhanced through the implementation of the sampler, 
while care must be taken to maintain its ergodicity properties. 

For our purposes we used a basic Metropohs sampler that we recently upgraded to 
an evolutionary MCMC algorithm [23], a generalization that is motivated by genetic 
algorithms [29]. This extension offers substantial improvement over the previously 
employed parallel tempering [18] and yielded a sampler that reliably converged towards 
the true posterior distribution in the examples discussed below. More details on the 
implementation of the evolutionary MCMC algorithm can be found in section 12. 6[ 

2.3. Data and signal waveform 

Our simulated data consist of simultaneous measurements from several interferometric 
detectors, superimposed with interferometer-specific Gaussian noise. The signal 
waveform that was injected into and recovered from the data was implemented using a 
3.5 post-Newtonian (PN) approximation for the phase evolution [HI [30], and a 2.5 PN 
model for the amplitude ^21j. The 9 parameters determining the responses at different 
interferometers are: individual masses (mi,m2 G R"*"; mi < 7712), luminosity distance 
{dL G R^), inclination angle (i G [0,7r]), coalescence phase (0o G [0,27r]), coalescence 
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time at geocenter (tc G R), declination {6 G [— right ascension (a G [0,27r]) 
and polarization angle (-0 G [0,7r]). In order to derive the waveform at an individual 
interferometer /, the 'local parameters' altitude {'&^^^), azimuth {(f^^^) local coalescence 
time (4^^) and local polarisation (ip^^^) need to be determined for each interferometer. 
More specific definitions are given in [18], [31]. An appropriate way of dealing with the 
failure of the waveform approximation shortly before coalescence still needs to be found. 
For now we simply terminate the waveform as soon as the innermost stable orbit [19] 
is reached, or as the (3.5 PN) approximated orbital frequency starts decreasing. The 
latter usually happens first; it is a non-physical effect that has also been noticed in other 
contexts and gives an indication of the obvious failure of the waveform approximation 
[32| |33] . In the future numerical integration might be used to extrapolate still further. 
The same approach could be used with a wide range of waveforms; in previous studies 
we have used implementations of the 2.0 PN stationary-phase approximation [17], and a 
2.5 PN phase and 2.0 PN amplitude approximation [18]. We are currently also working 
on an extension to the case of spinning binaries, which entails the consideration of 
several additional parameters. 

2.4- Priors 

We applied non-informative priors on the 'geometrical' parameters that describe the 
inspiral event's location and orientation. Assuming that any direction and orientation 
is equally likely (or none of these is a priori 'preferred'), this leads to uniform priors for 
right ascension a, polarization angle ip and coalescence phase (po, and to prior densities 

f(S) = lcos{6) and /(i) = ^sin(0 (1) 

for declination 6 and inclination angle l. These also define the Maximum Entropy 
settings for these parameters [25]. The coalescence time tc is assumed to be known 
in advance up to a certain accuracy from the detection pipeline that would in reality 
precede such an analysis [TD]. For our demonstration purposes we set its prior to be 
uniform across ±5 ms around the true value (which of course is known for simulated 
data); using wider ranges only makes the search phase longer, due to the larger 
parameter space [18j. The prior for the masses (mi, reflects the distribution of the 
masses among binary inspirals, which could be based on observational evidence [34l [35] 
as well as theoretical considerations [361 133 • ^^"^ now, we simply defined it as uniform 
across a range of 1-10 Mg. In principle, this type of search will be applicable for 
component masses up to 20 Mg (which corresponds to the low frequency sensitivity 
limit for LIGO and Virgo). 

Assuming that inspirals happen uniformly across space leads to a prior P{dL < x) oc 
for the luminosity distance di. This is an improper prior, seemingly implying there 
was an 'infinite' probability for 'infinitely remote' inspiral events. It is also unrealistic, 
since an inspiral event needs to happen within a certain range in order to be detectable, 
otherwise its signal would be too faint to be noticed at all. We incorporated this 
restriction into the prior specification by considering the detection probability of an 
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inspiral event, depending on the signal-to-noise ratio (SNR). A signal's SNR increases 
linearly with its amplitude, so for the prior definition we use the amplitude as an 
approximation to the SNR. Further simplifying its expression (and considering only 
the intrinsic parameters' effects on the amplitude) we define 



A{mi,m2,dL,t) := ln(^^^^|^j + ln(^^(l + cos{Lyy + (2 cos(0)2 

>1 and <VSai2.8 

ln( ^^* I + ^Infl + 6cos'6 + cos^6), (2) 




, dL 

where rut = rrii + m2 is the total mass, and the (symmetric) mass ratio of 

the inspiralling system {A is actually proportional to the logarithmic amplitude) |17j . 
Now one could set a threshold amplitude below which the corresponding event would 
be considered undetectable, but we preferred a smoother transition that doesn't strictly 
rule out parts of the parameter space. We do so by modeling the detection probability 

Da,bix) = — (3) 

1 + exp(^) 

as a (sigmoidal) function of the amplitude value x. The values of a and b are set by 
defining at which amplitudes xl and xu the detection probability reaches some value p 
and exceeds 1 — p (where < p < 0.5, e.g. p := 0.1). Given xl and xu, these are set to 

Xl + Xu , , Xu ~ XL f.. 

a := and o := — ; — , „ , . (4) 

2 2 log(^) ^ > 

In the following, we defined p := 0.1, xjj := ^(2M0, 2M0, 50Mpc, 0) and Xl ■ = 
^(2Mq, 2M0, 60Mpc, 0), assuming that a 2-2 Mq inspiral with zero inclination is 
detectable out to distances of 50 and 60 Mpc with 90% and 10% probability, respectively, 
and providing a reasonable coverage of the parameter space for the example below. More 
realistic bounds may be specified with respect to a certain detection pipehne that is 
supposed to be installed upstream. Considerations within a similar context (long-term 
observations of pulsars' gravitational wave signals) indeed show that while detection 
of signals is certain for high amplitudes and impossible for low amplitudes, there also 
is a transition region in between where detectability is a matter of chance [381 139] . 
The detection probability then enters the prior definition as an additional factor. 
Considering 'occurrence' and 'detection' probabilities this way then leads to a proper 
prior distribution for all parameters, refiecting the knowledge about the inspiral signal 
given that it released a trigger in the pipeline. Figure [T] illustrates some marginal prior 
densities resulting from the above settings. 

One could actually explicitly use the SNR for the prior instead of approximating 
it by A. An SNR computation in general is computationally about as expensive as 
a likelihood evaluation; but once one already did the likelihood evaluation for given 
parameter values, the SNR computation would simplify, since it could partly build on 
computations done in the first step. When approximating the SNR by A, any effects of 
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Figure 1. Marginal joint prior densities for total mass m* and distance di, and 
inclination l and distance dj^. The contour line encloses a 99% credibility region. 



the mass parameters besides their effect on the overall amplitude are neglected, as well as 
the impact of the antennae patterns, i.e. the sensitivities of the individual interferometers 
with respect to the signal's sky location. 

The above definitions imply that e.g. greater masses have a greater prior probability 
(since, resulting in greater amplitudes, they are still detectable at farther distances, 
see figured]), although initially any masses were assumed to occur equally likely. An 
analogous effect is known in astronomy as the Malmquist effect; incorporating it into 
the prior definition will compensate for selection bias that would otherwise affect the 
parameter estimates [lOl HI] . 

2.5. Likelihood 

We assume that the noise is independent between different interferometers. Consequent- 
ly, the likelihood for each site can be computed individually, and the network likelihood 
then arises as the product of the individual likelihoods. Individual likelihoods are 
computed based on Fourier transforms of data and signal, and the noise spectrum, that 
is specific for each interferometer [12]. More details about the likelihood computation 
are given in jl8j . 

2.6. MCMC implementation 

We implemented the MCMC sampler as a basic Metropolis algorithm [27] [16] that first 
was extended to a parallel tempering algorithm. The 'tempering' here works as in a 
simulated annealing algorithm [28], and prevents MCMC chains from getting stuck in 
local modes of the posterior distribution. Parallel tempering then is the special case 
of a Metropolis-coupled MCMC (MCMCMC) algorithm [16], where several tempered 
MCMC chains, each at different temperatures, are run in parallel, and additional 
proposals are introduced to 'swap' parameter sets between chains [^ [T8] . This 
algorithm can be further refined by implementing elements of genetic algorithms |29j . 
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The set of parallel chains may be thought of as constituting a 'population' whose 
individuals may be crossed to form 'hybrids' that inherit properties from both 'parental' 
chains, the result being an evolutionary MCMC algorithm [23]. The 'crossovers' between 
sets of parameters were implemented as real crossovers, in which offsprings are formed 
by randomly reassembling the parental parameter sets, as well as snooker crossovers, 
in which a new offspring is proposed somewhere on the straight line connecting the 
two parental points in parameter space [H]. Internally, instead of the original mass 
parameters, the chirp mass rric = ^^_^^ly/5 and the (symmetric) mass ratio r] = ^J^^^^2 
were used, since these are easier to sample from. 



3. Example application 

We applied our MCMC routine to a simulated data set, corresponding to an inspiral 
signal that is received at three interferometers, specifically the two LIGO sites Hanford 
(LHO) and Livingston (LLO), and the Virgo interferometer near Pisa (V). The simulated 
inspiral involved masses of nii = 2 Mg and m2 = 5 Mg (chirp mass nic = 2.70 Mq, mass 
ratio r] = 0.204), observed from a distance of di = 30 Mpc at = 700 009 012.345 GPS 
seconds. For the synthesized data that we use the noise characteristics were assumed to 
match the target sensitivities for LIGO and Virgo [15]. The resulting SNRs [18] at the 
three sites were 8.4 (LHO), 10.9 (LLO), 6.4 (V), and the network SNR was 15.2. 
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Figure 2. Marginal joint posterior densities for some of the parameters. Dashed lines 
indicate the true parameter values. 



Figure [2] shows the marginal posterior distributions for several individual 
parameters in comparison to the true values for the injected signal. While some of 
the (marginal) distributions appear roughly Gaussian, others are clearly not, and some 
even possess multiple modes. This illustrates some of the strengths of a fully Bayesian 
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Table 1. Some key figures of the individual parameters' marginal posterior 
distributions, where meaningful. Mean and standard deviation illustrate location and 
spread, and the 95% central credible interval gives a range that contains the true 
parameter with 95% probability, given the data at hand. 



mean st.dev. 95% c.c.i. true unit 



chirp mass (mc) 


2.6988 


0.0043 


(2.6913, 2.7071) 


2.6976 


Mo 


mass ratio (rj) 


0.2069 


0.0092 


(0.1917, 0.2258) 


0.2041 




coalescence time (tc) 


12.3454 


0.0019 


(12.3420, 12.3491) 


12.3450 


s 


luminosity distance {di) 


31.3 


7.2 


(17.4, 43.6) 


30.0 


Mpc 


inclination angle (i) 
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0.343 


(0.160, 1.462) 


0.700 


rad 
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-0.499^ 


|o.025^ 


(-0.540, -0.457) 


-0.506 


rad 


right ascension (a) 


4.657^ 


(4.632, 4.689) 


4.647 


rad 


coalescence phase (cpo) 


2.84^ 


1.38^ 




2.0 


rad 



mean direction and spherical st.dev. (suitable for angular variables) [JS] 



approach: no approximations to the posterior's (or hkeUhood's) shape are made, an 
irregular posterior surface does not pose a problem, and the assessment of relative 
importance of multiple modes arises as a matter of course [18] . Figure [3] illustrates the 
joint distributions of two pairs of parameters. The high correlation between chirp mass 
and mass ratio indicates some degeneracy between these two parameters. Table [1] lists 
some numerical estimates for individual parameters for our specific example. 




chirp mass trie (sun masses) right ascension a 



Figure 3. Left: Marginal joint posterior density for the two mass parameters, and 
a 99% credibility region. Right: 95% and 99% credibility regions for the sky location 
against the backdrop of the night sky. Dashed lines indicate the true values. 
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4. Discussion 

We have presented a description of our coherent MCMC code for estimating nine 
parameters associated with a binary inspiral signal detected by a network of 
interferometric detectors. This program uses time-domain inspiral templates that are 
3.5 PN in phase and 2.5 PN in amphtude. New MCMC techniques, such as evolutionary 
MCMC and genetic algorithms, have been implemented in our code. The code can be 
applied to inspiral signals where the masses of the components can be as large as 20 Mq; 
inspirals with large mass ratios can also be successfully analyzed. This code is part of a 
large mass ratio inspiral detection pipeline that we are currently developing; a loose-net 
inspiral detection pipeline (using, for example, lower order PN templates) will generate 
a reasonable number of triggers, and this MCMC will then be applied to those times 
where triggers were recorded. The next logical extension of our binary inspiral MCMC 
work will be to systems with spin. The addition of spin will increase the number of 
parameters needed for the model, and consequently will increase the complexity and 
the time required to run the MCMC. This is currently an area of active research for us. 
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